Integration tests for Catalog Federation#2344
Conversation
dimas-b
left a comment
There was a problem hiding this comment.
Thanks, @poojanilangekar ! The PR LGTM overall, but I have a concern with the IT test location (as commented).
...tests/src/main/java/org/apache/polaris/service/it/test/CatalogFederationIntegrationTest.java
Outdated
Show resolved
Hide resolved
runtime/service/src/intTest/java/org/apache/polaris/service/it/CatalogFederationIT.java
Show resolved
Hide resolved
...tests/src/main/java/org/apache/polaris/service/it/test/CatalogFederationIntegrationTest.java
Outdated
Show resolved
Hide resolved
...tests/src/main/java/org/apache/polaris/service/it/test/CatalogFederationIntegrationTest.java
Outdated
Show resolved
Hide resolved
| public Map<String, String> getConfigOverrides() { | ||
| return Map.of( | ||
| "polaris.features.\"ENABLE_CATALOG_FEDERATION\"", "true", | ||
| "polaris.features.\"ALLOW_OVERLAPPING_CATALOG_URLS\"", "true"); |
There was a problem hiding this comment.
Would it be possible to use 2 different base locations + 2 different storage configs for the two catalogs, and thus avoid setting this?
There was a problem hiding this comment.
We need to use the same base location because we are using a single Polaris instance. So unless we have two different Polaris instances (which we can't do in the current integration test setup), we need this configuration.
There was a problem hiding this comment.
@poojanilangekar : Do you mean that federated catalogs check for location overlaps with internal catalogs?
There was a problem hiding this comment.
Yes, that is correct.
Unless the ALLOW_OVERLAPPING_CATALOG_URLS flag is set, Polaris always checks if the base location of a CreateCatalog (irrespective of whether its INTERNAL vs EXTERNAL) request overlaps with an existing catalog. Therefore we need this flag to be set, at least for the JUnit tests.
There was a problem hiding this comment.
(This has come up before, but I think the fix is that federating EXTERNAL catalogs shouldn't need or support a location)
There was a problem hiding this comment.
Not at the moment. Making this change would require changing the API spec and making subsequent changes in the CreateCatalog validations.
There was a problem hiding this comment.
Would you be able to open an issue for sorting this out later?
There was a problem hiding this comment.
This PR is good to merge from my POV 👍
| public static class CatalogFederationProfile implements QuarkusTestProfile { | ||
| @Override | ||
| public Map<String, String> getConfigOverrides() { | ||
| return Map.of( |
There was a problem hiding this comment.
It might be good to explicitly set SUPPORTED_CATALOG_CONNECTION_TYPES
| newUserCredentials = managementApi.createPrincipal(PRINCIPAL_NAME); | ||
|
|
||
| FileStorageConfigInfo storageConfig = | ||
| FileStorageConfigInfo.builder() | ||
| .setStorageType(StorageConfigInfo.StorageTypeEnum.FILE) | ||
| .setAllowedLocations(List.of(baseLocation.toString())) | ||
| .build(); | ||
|
|
||
| CatalogProperties catalogProperties = new CatalogProperties(baseLocation.toString()); | ||
|
|
||
| Catalog localCatalog = | ||
| PolarisCatalog.builder() | ||
| .setType(Catalog.TypeEnum.INTERNAL) | ||
| .setName(LOCAL_CATALOG_NAME) | ||
| .setProperties(catalogProperties) | ||
| .setStorageConfigInfo(storageConfig) | ||
| .build(); | ||
| managementApi.createCatalog(localCatalog); | ||
|
|
||
| CatalogGrant catalogGrant = | ||
| CatalogGrant.builder() | ||
| .setType(CatalogGrant.TypeEnum.CATALOG) | ||
| .setPrivilege(CatalogPrivilege.TABLE_WRITE_DATA) | ||
| .build(); | ||
| managementApi.addGrant(LOCAL_CATALOG_NAME, CATALOG_ROLE_NAME, catalogGrant); | ||
| managementApi.assignPrincipalRole(PRINCIPAL_NAME, PRINCIPAL_ROLE_NAME); | ||
| CatalogRole localCatalogAdminRole = | ||
| managementApi.getCatalogRole(LOCAL_CATALOG_NAME, CATALOG_ROLE_NAME); | ||
| managementApi.grantCatalogRoleToPrincipalRole( | ||
| PRINCIPAL_ROLE_NAME, LOCAL_CATALOG_NAME, localCatalogAdminRole); | ||
|
|
||
| AuthenticationParameters authParams = | ||
| OAuthClientCredentialsParameters.builder() | ||
| .setAuthenticationType(AuthenticationParameters.AuthenticationTypeEnum.OAUTH) | ||
| .setTokenUri(endpoints.catalogApiEndpoint().toString() + "/v1/oauth/tokens") | ||
| .setClientId(newUserCredentials.getCredentials().getClientId()) | ||
| .setClientSecret(newUserCredentials.getCredentials().getClientSecret()) | ||
| .setScopes(List.of("PRINCIPAL_ROLE:ALL")) | ||
| .build(); | ||
| ConnectionConfigInfo connectionConfig = | ||
| IcebergRestConnectionConfigInfo.builder() | ||
| .setConnectionType(ConnectionConfigInfo.ConnectionTypeEnum.ICEBERG_REST) | ||
| .setUri(endpoints.catalogApiEndpoint().toString()) | ||
| .setRemoteCatalogName(LOCAL_CATALOG_NAME) | ||
| .setAuthenticationParameters(authParams) | ||
| .build(); | ||
| ExternalCatalog externalCatalog = | ||
| ExternalCatalog.builder() | ||
| .setType(Catalog.TypeEnum.EXTERNAL) | ||
| .setName(EXTERNAL_CATALOG_NAME) | ||
| .setConnectionConfigInfo(connectionConfig) | ||
| .setProperties(catalogProperties) | ||
| .setStorageConfigInfo(storageConfig) | ||
| .build(); | ||
| managementApi.createCatalog(externalCatalog); | ||
|
|
||
| managementApi.addGrant(EXTERNAL_CATALOG_NAME, CATALOG_ROLE_NAME, catalogGrant); | ||
| CatalogRole externalCatalogAdminRole = | ||
| managementApi.getCatalogRole(EXTERNAL_CATALOG_NAME, CATALOG_ROLE_NAME); | ||
| managementApi.grantCatalogRoleToPrincipalRole( | ||
| PRINCIPAL_ROLE_NAME, EXTERNAL_CATALOG_NAME, externalCatalogAdminRole); |
There was a problem hiding this comment.
We only have one test right now so it's fine, but if we add more this should get refactored out
dimas-b
left a comment
There was a problem hiding this comment.
LGTM, but let's resolve outstanding conversations before merging.
#2344 added a new test for catalog federation, but it looks like an undetected conflict with concurrent changes related to authentication have broken the test in main.
* Integration tests for Catalog Federation (apache#2344) Adds a Junit5 integration test for catalog federation. * Fix merge conflict in CatalogFederationIntegrationTest (apache#2420) apache#2344 added a new test for catalog federation, but it looks like an undetected conflict with concurrent changes related to authentication have broken the test in main. * chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.23-6.1755674729 (apache#2416) * 2334 (apache#2427) * Fix TableIdentifier in TaskFileIOSupplier (apache#2304) we cant just convert a `TaskEntity` to a `IcebergTableLikeEntity` as the `getTableIdentifier()` method will not return a correct value by using the name of the task and its parent namespace (which is empty?). task handlers instead need to pass in the `TableIdentifier` that they already inferred via `TaskEntity.readData`. * Fix NPE in CreateCatalog (apache#2435) * Doc fix: Access control page update (apache#2424) * 2418 * 2418 * fix(deps): update dependency software.amazon.awssdk:bom to v2.32.29 (apache#2443) * Optimize PolicyCatalog.listPolicies (apache#2370) this is a follow-up to apache#2290 the optimization is to use `listEntities` instead of `loadEntities` when there is `policyType` filter to apply * Add PolarisDiagnostics field to BaseMetaStoreManager (apache#2381) * Add PolarisDiagnostics field to BaseMetaStoreManager the ultimate goal is removing the `PolarisCallContext` parameter from every `PolarisMetaStoreManager` interface method, so we make steps towards reducing its usage first. * Add feature flag to disallow custom S3 endpoints (apache#2442) * Add new realm-level flag: `ALLOW_SETTING_S3_ENDPOINTS` (default: true) * Enforce in `PolarisServiceImpl.validateStorageConfig()` Fixes apache#2436 * Deprecate ActiveRolesProvider for removal (apache#2404) * Client: fix openapi verbose output, remove doc generate, and skip test generations (apache#2439) * Fix various issue in client code generation * Use logger instead of print * Add back exclude on __pycache__ as CI is not via Makefile * Add back exclude on __pycache__ as CI is not via Makefile * Add user principal tag in metrics (apache#2445) * Added API change to enable tag * Added test * Added production readiness check * fix(deps): update dependency io.opentelemetry.semconv:opentelemetry-semconv to v1.36.0 (apache#2454) * fix(deps): update dependency com.google.cloud:google-cloud-storage-bom to v2.56.0 (apache#2447) * fix(deps): update dependency gradle.plugin.org.jetbrains.gradle.plugin.idea-ext:gradle-idea-ext to v1.3 (apache#2428) * Build: Make jandex dependency used for index generation managed (apache#2431) Also allows specifying the jandex index version for the build. This is a preparation step contributing to apache#2204, once a jandex fix for reproducible builds is available. Co-authored-by: Alexandre Dutra <adutra@apache.org> * Built: improve reproducible archive files (apache#2432) As part of the effort for apache#2204, this change fixes a few aspects around reproducible builds: Some Gradle projects produce archive files, but don't get the necessary Gradle archive-tasks settings applied: one not-published project but also the tarball&zip of the distribution. This change moves the logic to the new build-plugin `polaris-reproducible`. Another change is to have some Quarkus generated jar files adhere to the same conventions, which are constant timestamps for the zip entries and a deterministic order of the entries. That's sadly not a full fix, as the classes that are generated or instumented by Quarkus differ in each build. Contributes to apache#2204 * Remove commons-lang3 dependency (apache#2456) outside of tests we can replace the functionality with jdk11 and guava. also stop using `org.assertj.core.util` as its a non-public api. * add refresh credentials property to loadTableResult (apache#2341) * add refresh credentials property to loadTableResult * IcebergCatalogAdapterTest: Added test to ensure refresh credentials endpoint is included * delegate refresh credential endpoint configuration to storage integration * GCP: Add refresh credential properties * fix(deps): update dependency io.opentelemetry.semconv:opentelemetry-semconv to v1.37.0 (apache#2458) * Add Delegator to all API Implementations (apache#2434) Per the Dev ML, implements the Delegator pattern to add Events instrumentation to all Polaris APIs. * Prefer java.util.Base64 over commons-codec (apache#2463) `java.util.Base64` is available since java8 and we are already using it in a few other spots. in a follow-up we might be able to get rid of our `commons-codec` dependency completely. * Service: Move tests to the right package (apache#2469) * Update versions in runtime LICENSE and NOTICE (apache#2468) * fix(deps): update dependency com.adobe.testing:s3mock-testcontainers to v4.8.0 (apache#2475) * fix(deps): update dependency com.gradleup.shadow:shadow-gradle-plugin to v9.1.0 (apache#2476) * Service: Remove hadoop-common from polaris-runtime-service (apache#2462) * Service: Always validate allowed locations from Storage Config (apache#2473) * Add Community Sync Meeting 20250828 (apache#2477) * Update dependency software.amazon.awssdk:bom to v2.33.0 (apache#2483) * Remove PolarisCallContext.getDiagServices (apache#2415) * Remove PolarisCallContext.getDiagServices usage * Remove diagnostics from PolarisCallContext * Feature: Expose resetCredentials via a new reset api to allow root user to reset credentials for an existing principal with custom values (apache#2197) * Add type-check to PolarisEntity subclass ctors (apache#2302) currently one can freely "cast" any `PolarisEntity` to a more specific type via their constructors. this can lead to subtle bugs like we fixed in a29f800 by adding type checks we discover a few more places where we need to be more careful about how we construct new or handle existing entities. note that we can add a check for `PolarisEntitySubType` in a followup, but it requires more fixes currently. * Fix CI (apache#2489) Fix undetected merge conflict after apache#2197 + apache#2415 + apache#2434 * Use local diagnostics in TransactionWorkspaceMetaStoreManager * Add resetCredentials to PolarisPrincipalsEventServiceDelegator * Core: Prevent AIOOBE for negative codes in PolarisEntityType, PolarisPrivilege, ReturnStatus (apache#2490) * feat(idgen): Start Implementation of NoSQL with the ID Generation Framework (apache#2131) Create an ID Generation Framework. Related to apache#650 & apache#844 Co-authored-by: Robert Stupp <snazy@snazy.de> Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com> * perf(refactor): optimizing JdbcBasePersistenceImpl.listEntities (apache#2465) - Reduced Column Selection: Only 6 columns instead of 16 - Eliminated Object Creation Overhead: Direct conversion to EntityNameLookupRecord without intermediate PolarisBaseEntity * Add Polaris Events to Persistence (apache#1844) * AWS CloudWatch Event Sink Implementation (apache#1965) * Fix failing CI (apache#2498) * Update actions/stale digest to 3a9db7e (apache#2499) * Core: Prevent AIOOBE for negative policy codes in PredefinedPolicyType (apache#2486) * Service: Add location tests for views (apache#2496) * Update docker.io/jaegertracing/all-in-one Docker tag to v1.73.0 (apache#2500) * Update dependency io.netty:netty-codec-http2 to v4.2.5.Final (apache#2495) * Update actions/setup-python action to v6 (apache#2502) * Update the Release Guide about the Helm Chart package (apache#2179) * Update the Release Guide about the Helm Chart package * Update release-guide.md Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr> * Add missing commit message * Whitespace * Use Helm GPG plugin to sign the Helm chart * Fix directories during Helm chart copy to SVN * Add Helm index to SVN * Use long name for svn checkout * Ensure the Helm index is updated after the chart is moved to SVN dist release * Do not publish any Docker image before the vote succeeds * Typos * Revert "Do not publish any Docker image before the vote succeeds" This reverts commit 5617e65. * Don't mention Helm values.yaml in the release guide as it doesn't contain version details --------- Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr> * Update dependency com.azure:azure-sdk-bom to v1.2.38 (apache#2503) * Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.23-6.1756793420 (apache#2504) * Remove commons-codec dependency (apache#2474) follow-up to f8ad77a we can simply use guava instead and eliminate the extra dependency * CLI: Remove SCRIPT_DIR and default config location to user home (apache#2448) * Remove readInternalProperties helpers (apache#2506) the functionality is already provided by the `PrincipalEntity` * Add Events for Generic Table APIs (apache#2481) This PR adds the Events instrumentation for the Generic Tables Service APIs, surrounding the default delegated call to the business logic APIs. * Disable custom namespace locations (apache#2422) When we create a namespace or alter its location, we must confirm that this location is within the parent location. This PR introduces introduces a check similar to the one we have for tables, where custom locations are prohibited by default. This functionality is gated behind a new behavior change flag `ALLOW_NAMESPACE_CUSTOM_LOCATION`. In addition to allowing us to revert to the old behavior, this flag allows some tests relying on arbitrarily-located namespaces to pass (such as those from upstream Iceberg). Fixes: apache#2417 * fix for IcebergAllowedLocationTest (apache#2511) * Remove unused config from SparkSessionBuilder (apache#2512) Tests pass without it. * Add Events for Policy Service APIs (apache#2479) * Remove PolarisTestMetaStoreManager.jsonNode helper (apache#2513) * Update dependency software.amazon.awssdk:bom to v2.33.4 (apache#2517) * Update dependency com.nimbusds:nimbus-jose-jwt to v10.5 (apache#2514) * Update dependency io.opentelemetry:opentelemetry-bom to v1.54.0 (apache#2515) * Update dependency io.micrometer:micrometer-bom to v1.15.4 (apache#2519) * Port missed OSS change * NoSQL: adopt to updated test packages * NoSQL: adapt to removed PolarisDiagnostics param * NoSQL: fix libs.versions.toml * NoSQL: include jandex plugin related changes from OSS * NoSQL: changes for delete/set principal client-ID+secret * Last merged commit c6176dc --------- Co-authored-by: Pooja Nilangekar <poojan@umd.edu> Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com> Co-authored-by: Mend Renovate <bot@renovateapp.com> Co-authored-by: Yong Zheng <yongzheng0809@gmail.com> Co-authored-by: Christopher Lambert <xn137@gmx.de> Co-authored-by: Honah (Jonas) J. <honahx@apache.org> Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com> Co-authored-by: Alexandre Dutra <adutra@apache.org> Co-authored-by: fivetran-kostaszoumpatianos <kostas.zoumpatianos@fivetran.com> Co-authored-by: Jason <jasonf20@gmail.com> Co-authored-by: Adnan Hemani <adnan.h@berkeley.edu> Co-authored-by: Yufei Gu <yufei@apache.org> Co-authored-by: JB Onofré <jbonofre@apache.org> Co-authored-by: fivetran-arunsuri <103934371+fivetran-arunsuri@users.noreply.github.com> Co-authored-by: Adam Christian <105929021+adam-christian-software@users.noreply.github.com> Co-authored-by: Artur Rakhmatulin <artur.rakhmatulin@gmail.com> Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>
Adds a Junit5 integration test for catalog federation.